Agnostic Learning by Refuting
Authors
Abstract
The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of efficient agnostic learning. We introduce refutation complexity, a natural computational analog of the Rademacher complexity of a Boolean concept class, and show that it exactly characterizes the sample complexity of efficient agnostic learning. Informally, the refutation complexity of a class C is the minimum number of example-label pairs required to efficiently distinguish between the case where the labels correlate with the evaluation of some member of C (structure) and the case where the labels are i.i.d. Rademacher random variables (noise). The easy direction of this relationship was implicitly used in the recent framework for improper PAC learning lower bounds of Daniely and co-authors [6, 8, 10] via connections to the hardness of refuting random constraint satisfaction problems. Our work can be seen as turning the relationship between agnostic learning and refutation that is implicit in their work into an explicit equivalence. In a recent, independent work, Salil Vadhan [25] discovered a similar relationship between refutation and PAC learning in the realizable (i.e., noiseless) case.
1998 ACM Subject Classification: I.2.6 Learning
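The structure-versus-noise distinction described informally in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it is a brute-force test over a small finite class, and the names `max_correlation` and `refute`, the toy class of coordinate ("dictator") functions, the 10% label-flip rate, and the threshold 0.4 are all assumptions chosen for illustration. Efficient refutation in the paper's sense concerns classes far too large to enumerate this way.

```python
import random


def max_correlation(hypotheses, xs, ys):
    """Largest empirical correlation (1/m) * sum_i y_i * h(x_i) over the class."""
    m = len(xs)
    return max(sum(y * h(x) for x, y in zip(xs, ys)) / m for h in hypotheses)


def refute(hypotheses, xs, ys, threshold):
    """Report 'structure' if some hypothesis correlates with the labels
    at or above the (assumed) threshold, and 'noise' otherwise."""
    if max_correlation(hypotheses, xs, ys) >= threshold:
        return "structure"
    return "noise"


# Hypothetical toy class: the n coordinate functions h_j(x) = x[j] on {-1, +1}^n.
n, m = 5, 400
rng = random.Random(0)
hypotheses = [lambda x, j=j: x[j] for j in range(n)]
xs = [tuple(rng.choice((-1, 1)) for _ in range(n)) for _ in range(m)]

# Structured labels: agree with h_2 except for an assumed 10% of random flips.
structured_labels = [x[2] if rng.random() < 0.9 else -x[2] for x in xs]
# Noise labels: i.i.d. Rademacher random variables, independent of the examples.
noise_labels = [rng.choice((-1, 1)) for _ in range(m)]

print(refute(hypotheses, xs, structured_labels, threshold=0.4))
print(refute(hypotheses, xs, noise_labels, threshold=0.4))
```

With pure noise labels, the statistic computed by `max_correlation` is an empirical Rademacher-style quantity for the class, which is why a threshold on it separates the two cases once the sample is large enough.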
Similar resources
Learning by Refuting
The sample complexity of learning a Boolean-valued function class is precisely characterized by its Rademacher complexity. This has little bearing, however, on the sample complexity of efficient agnostic learning. We introduce refutation complexity, a natural computational analog of Rademacher complexity of a Boolean concept class and show that it exactly characterizes the sample complexity of ...
Refuting Learning Revisited
We consider, within the framework of inductive inference, the concept of refuting learning as introduced by Mukouchi and Arikawa, where the learner is not only required to learn all concepts in a given class but also has to explicitly refute concepts outside the class. In the first part of the paper, we consider learning from text and introduce a concept of limit-refuting learning that is interme...
More data speeds up training time in learning halfspaces over sparse vectors
The increased availability of data in recent years has led several authors to ask whether it is possible to use data as a computational resource. That is, if more data is available, beyond the sample complexity limit, is it possible to use the extra examples to speed up the computation time required to perform the learning task? We give the first positive answer to this question for a natural s...
Distribution-Specific Agnostic Boosting
We consider the problem of boosting the accuracy of weak learning algorithms in the agnostic learning framework of Haussler (1992) and Kearns et al. (1992). Known algorithms for this problem (Ben-David et al., 2001; Gavinsky, 2002; Kalai et al., 2008) follow the same strategy as boosting algorithms in the PAC model: the weak learner is executed on the same target function but over different dis...
On the Power of Membership Queries in Agnostic Learning
We study the properties of the agnostic learning framework of Haussler (1992) and Kearns, Schapire, and Sellie (1994). In particular, we address the question: is there any situation in which membership queries are useful in agnostic learning? Our results show that the answer is negative for distribution-independent agnostic learning and positive for agnostic learning with respect to a specific ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2018